This document records the steps being taken to predict the amount of water that will reach the Belesar dam. For this we rely on the meteorological forecast produced daily with WRF.
To build the predictive model we have a historical record of 15-minute data supplied by the Belesar dam, which includes variables such as inflow, reservoir level, turbined flow, rainfall, and a few others.
#Packages used throughout and never loaded elsewhere (e1071 provides svm(), used at the end)
library(dplyr)
library(lubridate)
library(stringr)
library(ggplot2)
library(e1071)
DHI<- readRDS(here::here('Data/Parques/Belesar/Historico/Historico_DHI_Belesar.RDS'))
head(DHI)
## DATE TEMPERATURA (ºC) LLUVIA ACUMULADA DÍA (l/m2)
## 3 2018-01-01 00:00:00 6.12 16.83
## 4 2018-01-01 00:15:00 5.44 0.00
## 5 2018-01-01 00:30:00 5.81 0.00
## 6 2018-01-01 00:45:00 6.11 0.00
## 7 2018-01-01 01:00:00 6.22 0.00
## 8 2018-01-01 01:15:00 5.97 0.00
## NIVEL EMBALSE (msnm) APORTACION (m3/s) CAUDAL TURBINADO (m3/s)
## 3 297.75 114.93 0
## 4 297.76 114.48 0
## 5 297.77 116.16 0
## 6 297.78 116.18 0
## 7 297.78 32.40 0
## 8 297.79 116.23 0
## Q. TURB. BCE. (m3/s)
## 3 17.60
## 4 17.03
## 5 16.88
## 6 16.88
## 7 16.93
## 8 16.90
One of the problems we run into when processing the data is the presence of atypical values, or "outliers". As an example, the following plot of the inflow shows a huge negative value that is unquestionably erroneous.
plot(DHI$`APORTACION (m3/s)`,
xlab = "Date",
ylab= "Aportacion [m³/s]")
Looking into the best way to remove outliers, I found a function written by an R enthusiast that works as follows, and I honestly like it.
outlierKD <- function(dt, var) {
  var_name <- eval(substitute(var), eval(dt))
  tot <- sum(!is.na(var_name))
  na1 <- sum(is.na(var_name))
  m1 <- mean(var_name, na.rm = TRUE)
  par(mar = rep(2, 4))
  par(mfrow = c(2, 2), oma = c(0, 0, 3, 0))
  boxplot(var_name, main = "With outliers")
  hist(var_name, main = "With outliers", xlab = NA, ylab = NA)
  #Values beyond 1.5 * IQR from the hinges, as flagged by boxplot.stats
  outlier <- boxplot.stats(var_name)$out
  mo <- mean(outlier)
  var_name <- ifelse(var_name %in% outlier, NA, var_name)
  boxplot(var_name, main = "Without outliers")
  hist(var_name, main = "Without outliers", xlab = NA, ylab = NA)
  title("Outlier Check", outer = TRUE)
  na2 <- sum(is.na(var_name))
  message("Outliers identified: ", na2 - na1, " from ", tot, " observations")
  message("Proportion (%) of outliers: ", (na2 - na1) / tot * 100)
  message("Mean of the outliers: ", mo)
  m2 <- mean(var_name, na.rm = TRUE)
  message("Mean without removing outliers: ", m1)
  message("Mean if we remove outliers: ", m2)
  #Overwrite the column in the caller's data frame
  dt[as.character(substitute(var))] <- invisible(var_name)
  assign(as.character(as.list(match.call())$dt), dt, envir = .GlobalEnv)
  message("Outliers successfully removed", "\n")
  return(invisible(dt))
}
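The detection step relies on `boxplot.stats`, which flags points lying beyond 1.5 times the interquartile range from the hinges. A minimal sketch of what gets flagged:

```r
#boxplot.stats()$out returns the values beyond 1.5 * IQR from the hinges
x <- c(1:10, 100)
boxplot.stats(x)$out
## [1] 100
```

Every flagged value is then replaced by NA, which is why each pass shrinks the distribution and a fresh pass can flag further points.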
We remove the negative values and then run the outlier filter, repeating it until no further outliers are flagged.
DHI$`APORTACION (m3/s)`[which(DHI$`APORTACION (m3/s)`<0)]<- NA
outlierKD(DHI,`APORTACION (m3/s)`)
## Outliers identified: 826 from 34552 observations
## Proportion (%) of outliers: 2.39059967585089
## Mean of the outliers: 1075.81016949153
## Mean without removing outliers: 131.482568013429
## Mean if we remove outliers: 108.354577773824
## Outliers successfully removed
outlierKD(DHI,`APORTACION (m3/s)`)
## Outliers identified: 317 from 33726 observations
## Proportion (%) of outliers: 0.939927652256419
## Mean of the outliers: 410.450378548896
## Mean without removing outliers: 108.354577773824
## Mean if we remove outliers: 105.488153491574
## Outliers successfully removed
outlierKD(DHI,`APORTACION (m3/s)`)
## Outliers identified: 228 from 33409 observations
## Proportion (%) of outliers: 0.682450836600916
## Mean of the outliers: 391.157280701754
## Mean without removing outliers: 105.488153491574
## Mean if we remove outliers: 103.525205991381
## Outliers successfully removed
outlierKD(DHI,`APORTACION (m3/s)`)
## Outliers identified: 159 from 33181 observations
## Proportion (%) of outliers: 0.479189897833097
## Mean of the outliers: 380.017798742138
## Mean without removing outliers: 103.525205991381
## Mean if we remove outliers: 102.193901944158
## Outliers successfully removed
outlierKD(DHI,`APORTACION (m3/s)`)
## Outliers identified: 168 from 33022 observations
## Proportion (%) of outliers: 0.5087517412634
## Mean of the outliers: 373.650892857143
## Mean without removing outliers: 102.193901944158
## Mean if we remove outliers: 100.805797771961
## Outliers successfully removed
outlierKD(DHI,`APORTACION (m3/s)`)
## Outliers identified: 50 from 32854 observations
## Proportion (%) of outliers: 0.152188470201498
## Mean of the outliers: 365.8664
## Mean without removing outliers: 100.805797771961
## Mean if we remove outliers: 100.40179124497
## Outliers successfully removed
outlierKD(DHI,`APORTACION (m3/s)`)
## Outliers identified: 4 from 32804 observations
## Proportion (%) of outliers: 0.0121936349225704
## Mean of the outliers: 362.2675
## Mean without removing outliers: 100.40179124497
## Mean if we remove outliers: 100.369856402439
## Outliers successfully removed
outlierKD(DHI,`APORTACION (m3/s)`)
## Outliers identified: 1 from 32800 observations
## Proportion (%) of outliers: 0.00304878048780488
## Mean of the outliers: 361.84
## Mean without removing outliers: 100.369856402439
## Mean if we remove outliers: 100.361884508674
## Outliers successfully removed
outlierKD(DHI,`APORTACION (m3/s)`)
## Outliers identified: 0 from 32799 observations
## Proportion (%) of outliers: 0
## Mean of the outliers: NaN
## Mean without removing outliers: 100.361884508674
## Mean if we remove outliers: 100.361884508674
## Outliers successfully removed
plot(DHI$`APORTACION (m3/s)`,
xlab = "Date",
ylab= "Aportacion [m³/s]",
type = "l")
Applying this function to the inflow data, we can see that values have been removed, but a closer look shows that the series still carries too much noise. Before moving on we need to fill the gaps we have been creating.
After cleaning the data, a large number of NA's have been generated in our dataset.
sum(is.na(DHI$`APORTACION (m3/s)`))
## [1] 7180
Next we fill in the missing data by interpolation. The imputeTS package can impute the missing values using several interpolation methods:
library(imputeTS)
#Fill the gaps (in recent imputeTS releases na.interpolation is called na_interpolation)
DHI$`APORTACION (m3/s)`<- na.interpolation(DHI$`APORTACION (m3/s)`)
#Check
sum(is.na(DHI$`APORTACION (m3/s)`))
## [1] 0
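As a sketch of the alternatives the package offers (function and option names as in recent imputeTS releases, where `na.interpolation` was renamed `na_interpolation`):

```r
library(imputeTS)

x <- c(2, NA, NA, 8, 10)
na_interpolation(x, option = "linear")  #straight-line fill: 2 4 6 8 10
na_interpolation(x, option = "spline")  #cubic-spline fill
na_interpolation(x, option = "stine")   #Stineman interpolation
```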
To remove this noise we apply a moving average. Bearing in mind that there are 96 observations per day, we try moving averages with several window lengths.
library(TTR)
for(n in c(1,seq(48,192,length.out = 4))){
Mavg<- SMA(DHI$`APORTACION (m3/s)`, n)
plot(DHI$DATE, Mavg,
xlab = "Date",
ylab= "Aportacion [m³/s]",
type = "l",
main = paste0("Aportación en Belesar. MA= ", n))
}
This removes a good deal of the noise from the inflow signal, although it is unclear to what extent we are still respecting the data. To continue, we decide to use a 96-point moving average (the number of observations generated in one day), which is roughly similar to a daily mean (although not the same thing).
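To see why a 96-point SMA is only roughly a daily mean, here is a base-R sketch where a 4-point window stands in for the 96-point one (`TTR::SMA(x, n)` is equivalent to the right-aligned filter below):

```r
#Right-aligned simple moving average, equivalent to TTR::SMA(x, n)
sma <- function(x, n) stats::filter(x, rep(1 / n, n), sides = 1)

x <- c(1, 2, 3, 4, 5, 6, 7, 8)       #two "days" of 4 observations each
sma(x, 4)                            #NA NA NA 2.5 3.5 4.5 5.5 6.5
tapply(x, rep(1:2, each = 4), mean)  #true daily means: 2.5 and 6.5
```

The SMA coincides with the daily mean only at the last observation of each day; everywhere else it is a sliding window that mixes two calendar days.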
Running the outlier-search function on the reservoir level, we see that this dataset contains fewer erroneous values.
outlierKD(DHI,`NIVEL EMBALSE (msnm)`)
## Outliers identified: 1 from 39978 observations
## Proportion (%) of outliers: 0.00250137575666617
## Mean of the outliers: 0
## Mean without removing outliers: 310.775323928161
## Mean if we remove outliers: 310.783097781224
## Outliers successfully removed
DHI$`NIVEL EMBALSE (msnm)`<- na.interpolation(DHI$`NIVEL EMBALSE (msnm)`)
Next we check the correlation between the inflow and the variation of the level of the Belesar dam.
x<- 96
aportacion_SMA<- SMA(DHI$`APORTACION (m3/s)`,x)
#The level differences are rescaled (ad hoc gain and offset) only so that both series fit on one axis
nivel_SMA<- SMA(c(0,diff(DHI$`NIVEL EMBALSE (msnm)`)*8000+150), x)
aportacion_SMA<- aportacion_SMA[!is.na(aportacion_SMA)]
nivel_SMA<- nivel_SMA[!is.na(nivel_SMA)]
plot(aportacion_SMA,type = "l",
xlab =" " ,ylab = "",xaxt= 'n',yaxt='n',
main=paste0("Variacion del nivel y aportacion\n Correlación de: ",
round(cor(aportacion_SMA, nivel_SMA), 3)))
lines(nivel_SMA, col = "red")
Another preliminary analysis we can run is the cross-correlation, to see at which lag the best correlation is obtained. This is very useful for analysing correlations between variables that are shifted in time.
ccf_belesar<- ccf(aportacion_SMA, nivel_SMA, lag.max = 5000)
#Maximum correlation
max(ccf_belesar$acf)
## [1] 0.5892262
#Lag at which the maximum correlation occurs
ccf_belesar$lag[which.max(ccf_belesar$acf)]
## [1] 34
This shows that these variables reach their best correlation (0.59) at a lag of 34 values in the inflow. Given that our dataset has a 15-minute resolution, this amounts to 34 × 15 min ≈ 8.5 hours: the inflow data influence the dam level with a delay of roughly a third of a day.
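As a sanity check of how `ccf` reports lags (negative lags mean the first series leads the second), a synthetic sketch where the second series is a known 34-step delayed copy of the first:

```r
set.seed(1)
x <- as.numeric(arima.sim(list(ar = 0.9), n = 2000))  #smooth random signal
y <- c(rep(0, 34), x[1:(2000 - 34)])                  #y is x delayed by 34 steps

cc <- ccf(x, y, lag.max = 100, plot = FALSE)
cc$lag[which.max(cc$acf)]
## [1] -34
```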
Next we build a table with the inflow and level variables obtained by different methods:
#Shift all the rain data backwards, because the accumulated total of each hour is logged in the following hour....
DHI$`LLUVIA ACUMULADA DÍA (l/m2)`<- lead(DHI$`LLUVIA ACUMULADA DÍA (l/m2)`)
Desacumular_lluvia_2018<- DHI[which(year(DHI$DATE)==2018),] %>% group_by(yday(DATE)) %>% mutate(desacumulada= c(diff(`LLUVIA ACUMULADA DÍA (l/m2)`),0),
lluvia=ifelse(desacumulada>=0, desacumulada, 0))
Desacumular_lluvia_2019<- DHI[which(year(DHI$DATE)==2019),] %>% group_by(yday(DATE)) %>% mutate(desacumulada= c(diff(`LLUVIA ACUMULADA DÍA (l/m2)`),0),
lluvia=ifelse(desacumulada>=0, desacumulada, 0))
Desacumular_lluvia<- rbind(Desacumular_lluvia_2018, Desacumular_lluvia_2019)
Medias_horarias_2018<- as.data.frame(Desacumular_lluvia[year(Desacumular_lluvia$DATE)==2018,]) %>% group_by(yday(DATE), hour(DATE)) %>%
summarize(., Acum_horaria=sum(lluvia, na.rm = T),
aport_mean=mean(`APORTACION (m3/s)`, na.rm = T),
nivel_mean=mean(`NIVEL EMBALSE (msnm)`, na.rm = T))
Medias_horarias_2019<- as.data.frame(Desacumular_lluvia[year(Desacumular_lluvia$DATE)==2019,]) %>% group_by(yday(DATE),hour(DATE)) %>%
summarize(., Acum_horaria=sum(lluvia, na.rm = T),
aport_mean=mean(`APORTACION (m3/s)`, na.rm = T),
nivel_mean=mean(`NIVEL EMBALSE (msnm)`, na.rm = T))
Medias_horarias<- rbind(Medias_horarias_2018, Medias_horarias_2019)
vector_Date<- seq(range(DHI$DATE)[1],range(DHI$DATE)[2],
by="hour")
Lluvia_acum_horaria<- as.data.frame(cbind(as.character(vector_Date[2:length(vector_Date)]),
Medias_horarias[,3:5]))
Lluvia_acum_horaria$`as.character(vector_Date[2:length(vector_Date)])`<- ymd_hms(Lluvia_acum_horaria$`as.character(vector_Date[2:length(vector_Date)])`)
colnames(Lluvia_acum_horaria)<- c("Date", "Lluvia_mm", "aport_mean", "nivel_mean")
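The de-accumulation step above (per-day differences of the accumulated gauge, clamping the negative midnight resets to zero) can be checked on toy data; the column names `desacumulada` and `lluvia` mirror the ones used above:

```r
library(dplyr)

#Two toy "days" with three readings of the daily accumulated rain gauge
toy <- data.frame(day  = rep(1:2, each = 3),
                  acum = c(0, 2, 5, 1, 1, 4))

toy %>%
  group_by(day) %>%
  mutate(desacumulada = c(diff(acum), 0),
         lluvia = ifelse(desacumulada >= 0, desacumulada, 0))
#per-interval rain: 2 3 0 | 0 3 0
```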
Check of the maximum shifted correlation for different SMA values.
for (n in seq(6,24*3, by=6)) {
y<- n
aportacion_mean_SMA<- SMA(Lluvia_acum_horaria$aport_mean,y)
nivel_mean_SMA<- SMA(Lluvia_acum_horaria$nivel_mean,y)
Tabla_DHI<- as.data.frame(cbind(Lluvia_acum_horaria,
aportacion_mean_SMA,
nivel_mean_SMA))
colnames(Tabla_DHI)<- c(names(Lluvia_acum_horaria), "aport_mean_SMA", "nivel_mean_SMA")
Tabla_DHI<- Tabla_DHI[complete.cases(Tabla_DHI),]
ccf_aport_mean<- ccf(Tabla_DHI$aport_mean_SMA,
Tabla_DHI$nivel_mean_SMA,
lag.max = 5000,
plot = T)
print(paste0("Maximum correlation of ",
round(max(ccf_aport_mean$acf), digits = 3) ,
". For a lag of: ",
round(ccf_aport_mean$lag[which.max(ccf_aport_mean$acf)]/24, digits = 2),
" days. With an SMA of ",
n,
" hours"))
}
## [1] "Maximum correlation of 0.609. For a lag of: -76.79 days. With an SMA of 6 hours"
## [1] "Maximum correlation of 0.616. For a lag of: -76.58 days. With an SMA of 12 hours"
## [1] "Maximum correlation of 0.621. For a lag of: -76.42 days. With an SMA of 18 hours"
## [1] "Maximum correlation of 0.624. For a lag of: -76.38 days. With an SMA of 24 hours"
## [1] "Maximum correlation of 0.627. For a lag of: -76.29 days. With an SMA of 30 hours"
## [1] "Maximum correlation of 0.629. For a lag of: -76.25 days. With an SMA of 36 hours"
## [1] "Maximum correlation of 0.632. For a lag of: -76.21 days. With an SMA of 42 hours"
## [1] "Maximum correlation of 0.634. For a lag of: -76.08 days. With an SMA of 48 hours"
## [1] "Maximum correlation of 0.636. For a lag of: -75.92 days. With an SMA of 54 hours"
## [1] "Maximum correlation of 0.638. For a lag of: -75.88 days. With an SMA of 60 hours"
## [1] "Maximum correlation of 0.64. For a lag of: -75.79 days. With an SMA of 66 hours"
## [1] "Maximum correlation of 0.642. For a lag of: -75.67 days. With an SMA of 72 hours"
Trying different SMAs, almost every case reaches the same conclusion: increasing the SMA improves the correlation slightly, but the maximum correlation between inflow and level appears at a lag of around 76 days. That makes little sense, or at the very least is of no use to us.
We now add the SMAs over the full 15-minute dataset and the SMA over the hourly means.
#96 is chosen because it is the number of observations in one day
x<- 96
aportacion_SMA<- SMA(DHI$`APORTACION (m3/s)`,x)
nivel_SMA<- SMA(DHI$`NIVEL EMBALSE (msnm)`, x)
aportacion_horaria<- aportacion_SMA[DHI$DATE%in%vector_Date]
nivel_horario<- nivel_SMA[DHI$DATE%in%vector_Date]
#24 hours are used so that the window is equivalent to 1 day
y<- 24
aportacion_mean_SMA<- SMA(Lluvia_acum_horaria$aport_mean,y)
nivel_mean_SMA<- SMA(Lluvia_acum_horaria$nivel_mean,y)
Tabla_DHI<- as.data.frame(cbind(Lluvia_acum_horaria,
aportacion_horaria[2:length(aportacion_horaria)],
nivel_horario[2:length(nivel_horario)],
aportacion_mean_SMA,
nivel_mean_SMA))
colnames(Tabla_DHI)<- c(names(Lluvia_acum_horaria), "aport_SMA", "nivel_SMA",
"aport_mean_SMA", "nivel_mean_SMA")
Tabla_DHI<- Tabla_DHI[complete.cases(Tabla_DHI),]
Next we plot the inflow and the level for the three cases.
k<- max(Tabla_DHI$nivel_mean_SMA)/max(Tabla_DHI$aport_mean_SMA)
ggplot(data=Tabla_DHI, aes(x=Date))+
geom_line(aes(y=aport_mean_SMA))+
xlab("Date")+ylab("Aportación [m³/s]")+
theme(panel.background = element_blank(),
panel.grid = element_blank())+
geom_line(aes(y=nivel_mean_SMA/k),group=1, color="red") +
scale_y_continuous(sec.axis = sec_axis(trans = ~ . / (1/k), name = "Nivel [msnm]",
breaks = seq(min(Tabla_DHI$nivel_mean_SMA),
max(Tabla_DHI$nivel_mean_SMA),
by=10)),
breaks = seq(min(Tabla_DHI$aport_mean_SMA),
max(Tabla_DHI$aport_mean_SMA),
by=20)) +
ggtitle("Nivel y aportación. SMA (24 horas) de la media horaria")
ccf_DHI<- ccf(Tabla_DHI$aport_mean_SMA,
Tabla_DHI$nivel_mean_SMA,
lag.max = 1000)
print(paste0("Maximum correlation of ",
round(max(ccf_DHI$acf), digits = 3) ,
". For a lag of: ",
round(ccf_DHI$lag[which.max(ccf_DHI$acf)]/24, digits = 2),
" days"))
## [1] "Maximum correlation of 0.583. For a lag of: -41.67 days"
k<- max(Tabla_DHI$nivel_SMA)/max(Tabla_DHI$aport_SMA)
ggplot(data=Tabla_DHI, aes(x=Date))+
geom_line(aes(y=aport_SMA))+
xlab("Date")+ylab("Aportación [m³/s]")+
theme(panel.background = element_blank(),
panel.grid = element_blank())+
geom_line(aes(y=nivel_SMA/k),group=1, color="red") +
scale_y_continuous(sec.axis = sec_axis(trans = ~ . / (1/k), name = "Nivel [msnm]",
breaks = seq(min(Tabla_DHI$nivel_SMA),
max(Tabla_DHI$nivel_SMA),
by=10)),
breaks = seq(min(Tabla_DHI$aport_SMA),
max(Tabla_DHI$aport_SMA),
by=20)) +
ggtitle("Nivel y aportación. SMA (96 datos = 1 día) de los datos 15-minutales")
ccf_DHI<- ccf(Tabla_DHI$aport_SMA,
Tabla_DHI$nivel_SMA,
lag.max = 4000)
print(paste0("Maximum correlation of ",
round(max(ccf_DHI$acf), digits = 3) ,
". For a lag of: ",
round(ccf_DHI$lag[which.max(ccf_DHI$acf)]/24, digits = 2),
" days"))
## [1] "Maximum correlation of 0.614. For a lag of: -61.42 days"
k<- max(Tabla_DHI$nivel_mean)/max(Tabla_DHI$aport_mean)
ggplot(data=Tabla_DHI, aes(x=Date))+
geom_line(aes(y=aport_mean))+
xlab("Date")+ylab("Aportación [m³/s]")+
theme(panel.background = element_blank(),
panel.grid = element_blank())+
geom_line(aes(y=nivel_mean/k),group=1, color="red") +
scale_y_continuous(sec.axis = sec_axis(trans = ~ . / (1/k), name = "Nivel [msnm]",
breaks = seq(min(Tabla_DHI$nivel_mean),
max(Tabla_DHI$nivel_mean),
by=10)),
breaks = seq(min(Tabla_DHI$aport_mean),
max(Tabla_DHI$aport_mean),
by=20)) +
ggtitle("Nivel y aportación. Medias horarias")
ccf_DHI<- ccf(Tabla_DHI$aport_mean,
Tabla_DHI$nivel_mean,
lag.max = 1000)
print(paste0("Maximum correlation of ",
round(max(ccf_DHI$acf), digits = 3) ,
". For a lag of: ",
round(ccf_DHI$lag[which.max(ccf_DHI$acf)]/24, digits = 2),
" days"))
## [1] "Maximum correlation of 0.621. For a lag of: -3.29 days"
From all of the above, we conclude that the best way to relate inflow and level is a moving average over the hourly means: an SMA of 142 values, i.e., around 6 days.
Another check I consider worthwhile is whether rainfall and inflow correlate as they should.
k<- max(Tabla_DHI$Lluvia_mm)/max(Tabla_DHI$aport_mean_SMA)
ggplot(data=Tabla_DHI, aes(x=Date))+
geom_line(aes(y=aport_mean_SMA))+
xlab("Date")+ylab("Aportación [m³/s]")+
theme(panel.background = element_blank(),
panel.grid = element_blank())+
geom_line(aes(y=Lluvia_mm/k),group=1, color="red") +
scale_y_continuous(sec.axis = sec_axis(trans = ~ . / (1/k), name = "Lluvia Horaria [l/m²]",
breaks = seq(min(Tabla_DHI$Lluvia_mm),
max(Tabla_DHI$Lluvia_mm),
by=10)),
breaks = seq(min(Tabla_DHI$aport_mean_SMA),
max(Tabla_DHI$aport_mean_SMA),
by=20)) +
ggtitle("Lluvia y aportacion")
ccf_DHI<- ccf(Tabla_DHI$aport_mean_SMA,
Tabla_DHI$Lluvia_mm,
lag.max = 1000)
print(paste0("Maximum correlation of ",
round(max(ccf_DHI$acf), digits = 3) ,
". For a lag of: ",
round(ccf_DHI$lag[which.max(ccf_DHI$acf)]/24, digits = 2),
" days"))
## [1] "Maximum correlation of 0.109. For a lag of: 1.71 days"
Tabla_DHI_cut<- Tabla_DHI[which(month(Tabla_DHI$Date)==12), ]
k<- max(Tabla_DHI_cut$Lluvia_mm)/max(Tabla_DHI_cut$aport_mean_SMA)
ggplot(data=Tabla_DHI_cut, aes(x=Date))+
geom_line(aes(y=aport_mean_SMA))+
xlab("Date")+ylab("Aportación [m³/s]")+
theme(panel.background = element_blank(),
panel.grid = element_blank())+
geom_line(aes(y=Lluvia_mm/k),group=1, color="red") +
scale_y_continuous(sec.axis = sec_axis(trans = ~ . / (1/k), name = "Lluvia Horaria [l/m²]",
breaks = seq(min(Tabla_DHI_cut$Lluvia_mm),
max(Tabla_DHI_cut$Lluvia_mm),
by=10)),
breaks = seq(min(Tabla_DHI_cut$aport_mean_SMA),
max(Tabla_DHI_cut$aport_mean_SMA),
by=20)) +
ggtitle("Lluvia y aportacion. Diciembre")
ccf_DHI<- ccf(Tabla_DHI_cut$aport_mean_SMA,
Tabla_DHI_cut$Lluvia_mm,
lag.max = 1000)
print(paste0("Maximum correlation of ",
round(max(ccf_DHI$acf), digits = 3) ,
". For a lag of: ",
round(ccf_DHI$lag[which.max(ccf_DHI$acf)]/24, digits = 2),
" days"))
## [1] "Maximum correlation of 0.065. For a lag of: 17.04 days"
Tabla_DHI_cut<- Tabla_DHI[which(month(Tabla_DHI$Date)==12 | year(Tabla_DHI$Date)==2019), ]
k<- max(Tabla_DHI_cut$Lluvia_mm)/max(Tabla_DHI_cut$aport_mean_SMA)
ggplot(data=Tabla_DHI_cut, aes(x=Date))+
geom_line(aes(y=aport_mean_SMA))+
xlab("Date")+ylab("Aportación [m³/s]")+
theme(panel.background = element_blank(),
panel.grid = element_blank())+
geom_line(aes(y=Lluvia_mm/k),group=1, color="red") +
scale_y_continuous(sec.axis = sec_axis(trans = ~ . / (1/k), name = "Lluvia Horaria [l/m²]",
breaks = seq(min(Tabla_DHI_cut$Lluvia_mm),
max(Tabla_DHI_cut$Lluvia_mm),
by=10)),
breaks = seq(min(Tabla_DHI_cut$aport_mean_SMA),
max(Tabla_DHI_cut$aport_mean_SMA),
by=20)) +
ggtitle("Lluvia y aportacion. Diciembre-Enero-Febrero")
ccf_DHI<- ccf(Tabla_DHI_cut$aport_mean_SMA,
Tabla_DHI_cut$Lluvia_mm,
lag.max = 1000)
print(paste0("Maximum correlation of ",
round(max(ccf_DHI$acf), digits = 3) ,
". For a lag of: ",
round(ccf_DHI$lag[which.max(ccf_DHI$acf)]/24, digits = 2),
" days"))
## [1] "Maximum correlation of 0.332. For a lag of: 1.71 days"
We can see that the correlation between inflow and rainfall is, a priori, very poor, and we will need some tricks to get around this. In principle it seems normal for these variables to correlate badly: the river inflow does depend on rainfall, but many other factors are involved. Moreover, the rain recorded at Belesar is not necessarily the best predictor of the inflow arriving at Belesar: it may be raining many kilometres further up the valley.
Before moving on, we save the historical table we have just built.
path_dataDHI <- here::here('Data/Parques/Belesar/Historico/DHI_historico_afinado.RDS')
saveRDS(Tabla_DHI,path_dataDHI)
The WRF historical data are organised as lists: for each location there are 2 dataframes.
Below are the loop and the functions used to build the complete WRF history for Belesar. Note that we have a folder of RDS files, ordered by date, for the Belesar location.
We use the following functions.
lon_lat_df_ls<- function(parque_list){
fecha<- names(parque_list)
fecha[!str_detect(fecha, " ")]<- paste0(fecha[!str_detect(fecha, " ")], " 00:00:00")
n_lon<- parque_list[[1]]$lon
n_lat<- parque_list[[1]]$lat
nombres<- paste0(n_lon,"__",n_lat)
list_localizaciones<- list()
for (localizaciones in 1:length(parque_list[[1]][,1]) ) {
l<- lapply(parque_list, function(x) x[localizaciones,])
df <- data.frame(matrix(unlist(l), nrow=length(l), byrow=T))
df<- as.data.frame(cbind(ymd_hms(fecha), df))
colnames(df)<-c("fechas", colnames(parque_list[[1]]))
list_localizaciones[[localizaciones]]<- df
}
names(list_localizaciones)<- nombres
return(list_localizaciones)
}
uv_transformation<- function(tabla_comp){
nombres<- colnames(tabla_comp)
u10<- tabla_comp$U10_MEAN
v10<- tabla_comp$V10_MEAN
u10_max<- tabla_comp$U10_MAX
v10_max<- tabla_comp$V10_MAX
wind_abs <- sqrt(u10^2 + v10^2)
wind_dir_rad <- atan2(u10/wind_abs, v10/wind_abs)
wind_dir_deg1 <- wind_dir_rad * 180/pi
wind_dir_deg2 <- wind_dir_deg1+ 180
wind_abs_max <- sqrt(u10_max^2 + v10_max^2)
wind_dir_rad_max <- atan2(u10_max/wind_abs_max, v10_max/wind_abs_max)
wind_dir_deg1_max <- wind_dir_rad_max * 180/pi
wind_dir_deg2_max <- wind_dir_deg1_max+ 180
tabla_comp<- as.data.frame(cbind(tabla_comp,wind_abs,wind_dir_deg2,
wind_abs_max,wind_dir_deg2_max))
colnames(tabla_comp)<- c(nombres,"WS","WD","WS_MAX","WD_MAX")
tabla_comp$U10_MAX<- NULL
tabla_comp$V10_MAX<- NULL
tabla_comp$U10_MEAN<- NULL
tabla_comp$V10_MEAN<- NULL
return(tabla_comp)
}
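The `uv_transformation` above converts u/v components into a speed and a meteorological direction (the direction the wind blows *from*, clockwise from north). A quick sanity check of the direction formula on two cardinal cases:

```r
#Wind direction in degrees from north, same formula as in uv_transformation
wind_dir <- function(u, v) {
  wind_abs <- sqrt(u^2 + v^2)
  atan2(u / wind_abs, v / wind_abs) * 180 / pi + 180
}

wind_dir(0, -1)  #wind from the north: 360 (i.e. 0) degrees
wind_dir(-1, 0)  #wind from the east: 90 degrees
```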
extract_rain_data<- function(Belesar_lolat_df){
  #Columns 1:5 are date, lon, lat, RAINC and RAINNC
  rain<- Belesar_lolat_df[,c(1,2,3,4,5)]
  rain$pre_acum<- rain$RAINC+rain$RAINNC
  rain$RAINC<- NULL
  rain$RAINNC<- NULL
  #De-accumulate: hourly precipitation is the difference of consecutive accumulated values
  rain$prep_hourly<- c(rain$pre_acum[1], diff(rain$pre_acum))
  return(rain)
}
extract_rain_data2<- function(Belesar_lolat_df){
  #Positional selection: date, lon, lat, RAINC, RAINNC, RAINSH, T02_MEAN, PSFC and WS_MAX
  rain<- Belesar_lolat_df[,c(1,2,3,4,5,6,10,17,20)]
  rain$pre_acum<- rain$RAINC+rain$RAINNC
  #De-accumulate into hourly precipitation
  rain$prep_hourly<- c(rain$pre_acum[1], diff(rain$pre_acum))
  return(rain)
}
Return_periodo_Belesar<- function(){
All_files_Belesar<- list.files(here::here('Data/Parques/Belesar/'),
recursive = F, full.names = T)
RDS_Belesar<- All_files_Belesar[str_detect(All_files_Belesar, ".RDS")]
RDS_Belesar1<- RDS_Belesar[!str_detect(RDS_Belesar, "E001")]
Belesar_data<- readRDS(RDS_Belesar1[1])
Belesar_lolat<- lon_lat_df_ls(Belesar_data)
Belesar_lolat1<- lapply(Belesar_lolat, uv_transformation)
Belesar_rain<- lapply(Belesar_lolat1, extract_rain_data)
fecha_ini<- Belesar_rain$`-8.02328491210938__42.1343421936035`$fechas[1]
Belesar_data<- readRDS(RDS_Belesar1[length(RDS_Belesar1)])
Belesar_lolat<- lon_lat_df_ls(Belesar_data)
Belesar_lolat1<- lapply(Belesar_lolat, uv_transformation)
Belesar_rain<- lapply(Belesar_lolat1, extract_rain_data)
fecha_last<- Belesar_rain$`-8.02328491210938__42.1343421936035`$fechas[length(Belesar_rain$`-8.02328491210938__42.1343421936035`$fechas)]
periodo_WRF<- seq(fecha_ini, fecha_last, by="hour")
Tabla_WRF<- as.data.frame(matrix(ncol = 10, nrow = length(periodo_WRF)))
colnames(Tabla_WRF)<- colnames(Belesar_rain[[1]])
Tabla_WRF$fechas<- periodo_WRF
return(Tabla_WRF)
}
#List the files inside the Belesar folder
All_files_Belesar<- list.files(here::here('Data/Parques/Belesar/'),
recursive = F, full.names = T)
#Detect which ones are RDS
RDS_Belesar<- All_files_Belesar[str_detect(All_files_Belesar, ".RDS")]
#Drop the RDS files that do not come from WRF
RDS_Belesar1<- RDS_Belesar[!str_detect(RDS_Belesar, "E001")]
#Build a list for each time instant
Lista_localizacion<- list()
for (i in 1:length(RDS_Belesar1)) {
Belesar_data<- readRDS(RDS_Belesar1[i])
Belesar_lolat<- lon_lat_df_ls(Belesar_data)
Belesar_lolat1<- lapply(Belesar_lolat, uv_transformation)
Belesar_rain<- lapply(Belesar_lolat1, extract_rain_data2)
Lista_localizacion[[i]]<- Belesar_rain
}
#Name the list
names_fechas<- sapply(RDS_Belesar1, function(x){
r<- str_split(x, "/")
return(str_remove(str_remove(r[[1]][length(r[[1]])], ".RDS"), "Belesar_"))
})
names(Lista_localizacion)<- names_fechas
#Create one list per location
Lista_localizacion2<- list()
for (i in 1:length(Lista_localizacion[[1]])) {
Lista_localizacion2[[i]]<- lapply(Lista_localizacion,
function(x) return(x[[i]]))
}
#Name the list
names(Lista_localizacion2)<- names(Lista_localizacion[[1]])
#Save the list
path_lista_total<- here::here('Data/Parques/Belesar/Historico/')
nombre_lista<- paste0(path_lista_total, 'Historico_WRF_Belesar_Variables.RDS')
saveRDS(Lista_localizacion2, nombre_lista)
#Load the list
path_lista_total<- here::here('Data/Parques/Belesar/Historico/')
nombre_lista<- paste0(path_lista_total, 'Historico_WRF_Belesar_Variables.RDS')
Lista_total1<- readRDS(nombre_lista)
#Combine all the dataframes
Lista_total_MF<- lapply(Lista_total1, function(x) bind_rows(x))
#Create two data.frames per location: one for D1 and one for D2
Lista_d1_d2_loc<- list()
for (i in 1:length(Lista_total_MF)) {
p<- Lista_total_MF[[i]]
#Rows with repeated timestamps (overlapping forecast runs) seed D1
d1<- p[duplicated(p$fechas),]
d1<-d1[!duplicated(d1$fechas),]
#The first occurrence of each timestamp forms D2
d2<- p[!duplicated(p$fechas),]
#Timestamps seen only once are copied over so that D1 also covers the whole period
d2_qneed1<-d2[!(d2$fechas%in%d1$fechas),]
d1_2<-bind_rows(d1,d2_qneed1)
d1_2<-d1_2[order(d1_2$fechas),]
d2<-d2[order(d2$fechas),]
d1_2$pre_acum<- NULL
d2$pre_acum<- NULL
colnames(d1_2)<- c("Date", "LON", "LAT", "RAINC", "RAINNC","RAINSH", "T02_MEAN","PSFC","WS_MAX", "prep_hourly")
colnames(d2)<- c("Date", "LON", "LAT", "RAINC", "RAINNC","RAINSH", "T02_MEAN","PSFC","WS_MAX", "prep_hourly")
lista_loc_d12<- list(d1_2,d2)
names(lista_loc_d12)<- c("D1", "D2")
Lista_d1_d2_loc[[i]]<- lista_loc_d12
}
#Name the list
names(Lista_d1_d2_loc)<- names(Lista_total_MF)
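The `duplicated()`-based split above can be illustrated on a toy frame with overlapping runs; which run ends up in D1 versus D2 depends purely on row order:

```r
#Timestamps 2 and 3 appear in two runs; 1 and 4 in only one
p <- data.frame(fechas = c(1, 2, 2, 3, 3, 4),
                run    = c("a", "a", "b", "a", "b", "b"))

d2 <- p[!duplicated(p$fechas), ]  #first occurrence per timestamp
d1 <- p[duplicated(p$fechas), ]   #repeated occurrences
d1 <- d1[!duplicated(d1$fechas), ]

d2$run  # "a" "a" "a" "b"  (timestamps 1, 2, 3, 4)
d1$run  # "b" "b"          (timestamps 2, 3)
```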
Tabla_periodo1<- Return_periodo_Belesar()
colnames(Tabla_periodo1)<- c("Date", "LON", "LAT", "RAINC", "RAINNC","RAINSH", "T02_MEAN","PSFC","WS_MAX", "prep_hourly")
Lista_d1_d2_loc2<- list()
for (j in 1:length(Lista_d1_d2_loc)) {
prueba_list<- Lista_d1_d2_loc[[j]]
lista_retorno<- list()
for(i in 1:2){
prueba<- prueba_list[[i]]
Tabla_periodo<- Tabla_periodo1
Tabla_periodo$LON[match(prueba$Date,Tabla_periodo$Date)] <- prueba$LON
Tabla_periodo$LAT[match(prueba$Date,Tabla_periodo$Date)] <- prueba$LAT
Tabla_periodo$RAINC[match(prueba$Date,Tabla_periodo$Date)] <- prueba$RAINC
Tabla_periodo$RAINNC[match(prueba$Date,Tabla_periodo$Date)] <- prueba$RAINNC
Tabla_periodo$RAINSH[match(prueba$Date,Tabla_periodo$Date)] <- prueba$RAINSH
Tabla_periodo$T02_MEAN[match(prueba$Date,Tabla_periodo$Date)] <- prueba$T02_MEAN
Tabla_periodo$PSFC[match(prueba$Date,Tabla_periodo$Date)] <- prueba$PSFC
Tabla_periodo$WS_MAX[match(prueba$Date,Tabla_periodo$Date)] <- prueba$WS_MAX
Tabla_periodo$prep_hourly[match(prueba$Date,Tabla_periodo$Date)] <- prueba$prep_hourly
lista_retorno[[i]]<- Tabla_periodo
}
names(lista_retorno)<- c("D1", "D2")
Lista_d1_d2_loc2[[j]]<- lista_retorno
}
names(Lista_d1_d2_loc2)<- names(Lista_d1_d2_loc)
#Create a list with the refined variables
Lista_d1_d2_loc3<- lapply(Lista_d1_d2_loc2, function(x){
x[[1]]$RAINC<- NULL
x[[1]]$RAINNC<- NULL
x[[1]]$RAINSH<- NULL
x[[2]]$RAINC<- NULL
x[[2]]$RAINNC<- NULL
x[[2]]$RAINSH<- NULL
return(x)})
names(Lista_d1_d2_loc3)<- names(Lista_d1_d2_loc2)
#Save
path_hist_WRF<- here::here('Data/Parques/Belesar/Historico/Historico_WRF_Belesar_Variables_D1D2.RDS')
saveRDS(Lista_d1_d2_loc3,path_hist_WRF)
We join the refined DHI table to each WRF data.frame.
#Use dplyr to join both datasets
Belesar_WRF<- readRDS(here::here('Data/Parques/Belesar/Historico/Historico_WRF_Belesar_Variables_D1D2.RDS'))
Belesar_DHI<- readRDS(here::here('Data/Parques/Belesar/Historico/DHI_historico_afinado.RDS'))
df2<- Belesar_DHI
Belesar_Merge<- list()
for (j in 1:length(Belesar_WRF)) {
lista_retorno<- list()
for(i in 1:2){
df1<- Belesar_WRF[[j]][[i]]
Merge_table<- left_join(df1, df2, by=c("Date"))
lista_retorno[[i]]<- Merge_table
}
names(lista_retorno)<- c("D1", "D2")
Belesar_Merge[[j]]<- lista_retorno
}
names(Belesar_Merge)<- names(Belesar_WRF)
#Belesar merge, keeping complete cases only
Belesar_Merge_cc<- list()
for (j in 1:length(Belesar_Merge)) {
lista_retorno<- list()
for(i in 1:2){
df1<- Belesar_Merge[[j]][[i]]
df1$Acumulated<- NULL
Table_fine<- df1[complete.cases(df1),]
lista_retorno[[i]]<- Table_fine
}
names(lista_retorno)<- c("D1", "D2")
Belesar_Merge_cc[[j]]<- lista_retorno
}
names(Belesar_Merge_cc)<- names(Belesar_Merge)
#Save
path_hist_WRF<- here::here('Data/Parques/Belesar/Historico/Hist_D1D2_DHI_MERGED.RDS')
saveRDS(Belesar_Merge_cc,path_hist_WRF)
#load
path_hist_WRF<- here::here('Data/Parques/Belesar/Historico/Hist_D1D2_DHI_MERGED.RDS')
clean_data<- readRDS(path_hist_WRF)
#split into training and prediction periods
cut_train<- lapply(clean_data, function(x){
y<- lapply(x, function(r){
fecha_ini<- ymd("2018/10/01")
fecha_end<- ymd("2019/02/01")
Jan_data<- r[which(r$Date<fecha_end & r$Date>fecha_ini),]
return(Jan_data)
})
return(y)
})
cut_predict<- lapply(clean_data, function(x){
y<- lapply(x, function(r){
fecha_ini<- ymd("2019/02/01")
fecha_end<- ymd("2019/02/20")
Jan_data<- r[which(r$Date<fecha_end & r$Date>fecha_ini),]
return(Jan_data)
})
return(y)
})
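The two nearly identical `lapply` blocks above could be collapsed into a single helper; `cut_window` is a hypothetical name of ours, not something defined elsewhere in the project:

```r
#Hypothetical helper: keep the rows of every nested D1/D2 table inside (ini, end)
cut_window <- function(data_list, ini, end) {
  lapply(data_list, function(x) {
    lapply(x, function(r) r[r$Date > ini & r$Date < end, ])
  })
}

#cut_train   <- cut_window(clean_data, ymd("2018/10/01"), ymd("2019/02/01"))
#cut_predict <- cut_window(clean_data, ymd("2019/02/01"), ymd("2019/02/20"))
```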
lista_TL<- list()
for (i in 1:length(cut_train)) {
lista_d1d2<- list()
for (j in 1:2) {
data_predict<- cut_train[[i]][[j]]
data_predict2<- cut_predict[[i]][[j]]
fit_1 <- lm(Lluvia_mm ~ prep_hourly, data = data_predict)
fit_2 <- lm(Lluvia_mm ~ prep_hourly + T02_MEAN, data = data_predict)
fit_3 <- lm(Lluvia_mm ~ prep_hourly + PSFC, data = data_predict)
fit_4 <- lm(Lluvia_mm ~ prep_hourly + WS_MAX, data = data_predict)
fit_5 <- lm(Lluvia_mm ~ prep_hourly * T02_MEAN, data = data_predict)
fit_6 <- lm(Lluvia_mm ~ prep_hourly * PSFC, data = data_predict)
fit_7 <- lm(Lluvia_mm ~ prep_hourly * WS_MAX, data = data_predict)
uncorrected<-data_predict2$prep_hourly
prediction_rain<- predict(fit_1, data.frame(prep_hourly =data_predict2$prep_hourly))
prediction_rain2<- predict(fit_2, data.frame(prep_hourly =data_predict2$prep_hourly,
T02_MEAN =data_predict2$T02_MEAN))
prediction_rain3<- predict(fit_3, data.frame(prep_hourly =data_predict2$prep_hourly,
PSFC =data_predict2$PSFC))
prediction_rain4<- predict(fit_4, data.frame(prep_hourly =data_predict2$prep_hourly,
WS_MAX =data_predict2$WS_MAX))
prediction_rain5<- predict(fit_5, data.frame(prep_hourly =data_predict2$prep_hourly,
T02_MEAN =data_predict2$T02_MEAN))
prediction_rain6<- predict(fit_6, data.frame(prep_hourly =data_predict2$prep_hourly,
PSFC =data_predict2$PSFC))
prediction_rain7<- predict(fit_7, data.frame(prep_hourly =data_predict2$prep_hourly,
WS_MAX =data_predict2$WS_MAX))
observed_rain<-data_predict2$Lluvia_mm
tabla_cor<-cbind(cor(uncorrected, observed_rain),
cor(prediction_rain, observed_rain),
cor(prediction_rain2, observed_rain),
cor(prediction_rain3, observed_rain),
cor(prediction_rain4, observed_rain),
cor(prediction_rain5, observed_rain),
cor(prediction_rain6, observed_rain),
cor(prediction_rain7, observed_rain))
fit_1 <- svm(Lluvia_mm ~ prep_hourly, data = data_predict)
fit_2 <- svm(Lluvia_mm ~ prep_hourly + T02_MEAN, data = data_predict)
fit_3 <- svm(Lluvia_mm ~ prep_hourly + PSFC, data = data_predict)
fit_4 <- svm(Lluvia_mm ~ prep_hourly + WS_MAX, data = data_predict)
fit_5 <- svm(Lluvia_mm ~ prep_hourly * T02_MEAN, data = data_predict)
fit_6 <- svm(Lluvia_mm ~ prep_hourly * PSFC, data = data_predict)
fit_7 <- svm(Lluvia_mm ~ prep_hourly * WS_MAX, data = data_predict)
uncorrected<-data_predict2$prep_hourly
prediction_rain<- predict(fit_1, data.frame(prep_hourly =data_predict2$prep_hourly))
prediction_rain2<- predict(fit_2, data.frame(prep_hourly =data_predict2$prep_hourly,
T02_MEAN =data_predict2$T02_MEAN))
prediction_rain3<- predict(fit_3, data.frame(prep_hourly =data_predict2$prep_hourly,
PSFC =data_predict2$PSFC))
prediction_rain4<- predict(fit_4, data.frame(prep_hourly =data_predict2$prep_hourly,
WS_MAX =data_predict2$WS_MAX))
prediction_rain5<- predict(fit_5, data.frame(prep_hourly =data_predict2$prep_hourly,
T02_MEAN =data_predict2$T02_MEAN))
prediction_rain6<- predict(fit_6, data.frame(prep_hourly =data_predict2$prep_hourly,
PSFC =data_predict2$PSFC))
prediction_rain7<- predict(fit_7, data.frame(prep_hourly =data_predict2$prep_hourly,
WS_MAX =data_predict2$WS_MAX))
observed_rain<-data_predict2$Lluvia_mm
tabla_cor<-cbind(tabla_cor,
cor(prediction_rain, observed_rain),
cor(prediction_rain2, observed_rain),
cor(prediction_rain3, observed_rain),
cor(prediction_rain4, observed_rain),
cor(prediction_rain5, observed_rain),
cor(prediction_rain6, observed_rain),
cor(prediction_rain7, observed_rain))
colnames(tabla_cor)<- c("uncorrected", "1","2", "3","4","5","6","7",
"svm1","svm2", "svm3","svm4","svm5","svm6","svm7")
lista_d1d2[[j]]<- tabla_cor
}
names(lista_d1d2)<- c("D1", "D2")
lista_TL[[i]]<- lista_d1d2
}
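The fourteen `lm`/`svm` fits above all follow one pattern, so they could be driven by a list of formulas. A sketch with a hypothetical helper `eval_formulas`, assuming the same column names (`Lluvia_mm`, `prep_hourly`, `T02_MEAN`, `PSFC`, `WS_MAX`); swapping `lm` for `e1071::svm` would reproduce the second half of the loop:

```r
# Sketch: fit each candidate rain-correction formula on the training window
# and return its prediction/observation correlation on the test window.
eval_formulas <- function(train, test) {
  formulas <- list(
    Lluvia_mm ~ prep_hourly,
    Lluvia_mm ~ prep_hourly + T02_MEAN,
    Lluvia_mm ~ prep_hourly + PSFC,
    Lluvia_mm ~ prep_hourly + WS_MAX,
    Lluvia_mm ~ prep_hourly * T02_MEAN,
    Lluvia_mm ~ prep_hourly * PSFC,
    Lluvia_mm ~ prep_hourly * WS_MAX
  )
  sapply(formulas, function(f) {
    fit <- lm(f, data = train)
    # passing the whole test set as newdata avoids rebuilding a data.frame per model
    cor(predict(fit, newdata = test), test$Lluvia_mm)
  })
}

# tabla_cor <- c(cor(data_predict2$prep_hourly, data_predict2$Lluvia_mm),
#                eval_formulas(data_predict, data_predict2))
```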
Lista_TL_names<-lapply(lista_TL, function(x){
y<- lapply(x, function(r) {
r<- as.data.frame(r)
names(r)<- c("uncorrected", "1","2", "3","4","5","6","7",
"svm1","svm2", "svm3","svm4","svm5","svm6","svm7")
return(r)
})
names(y)<- c("D1", "D2")
return(y)
})
names(Lista_TL_names)<- names(cut_predict)
x<- lapply(Lista_TL_names, function(x){bind_rows(as.data.frame(x))})
y<- bind_rows(x, .id="id")
# For each model column, keep the grid point with the highest correlation
cor_table<- NULL
for (i in 2:ncol(y)) {
f<- as.data.frame(cbind(y[which.max(y[,i]), c(1,i)], names(y)[i]))
colnames(f)<- c("id","corr", "name")
cor_table<- rbind(cor_table, f)
}
cor_table<- cor_table[complete.cases(cor_table),]
table(cor_table$id)
##
## -7.33221435546875__42.8309211730957 -7.33944702148438__42.9655685424805
## 1 2
## -7.36126708984375__43.3694076538086 -7.478271484375__42.1520957946777
## 3 1
## -7.4857177734375__42.2868041992188 -7.50823974609375__42.6908493041992
## 2 1
## -7.5386962890625__43.2293548583984 -7.54638671875__43.363941192627
## 3 6
## -7.65997314453125__42.1464500427246 -7.6756591796875__42.415828704834
## 1 6
## -7.69937133789062__42.8197975158691 -7.73147583007812__43.3581962585449
## 1 1
## -7.9080810546875__43.2176132202148
## 2
punto_Belesar<- c(42.628577,-7.713948) # Belesar dam as (lat, lon); note that geosphere expects (lon, lat)
# Split the "lon__lat" id into two character columns; stringsAsFactors = FALSE
# avoids the long stream of factor-coercion warnings that bind_rows emits otherwise
extract_lat_lon<- as.data.frame(do.call(rbind, str_split(cor_table$id, "__")),
stringsAsFactors = FALSE)
cor_table<- as.data.frame(cbind(extract_lat_lon, cor_table[,2:3]))
colnames(cor_table)<- c("LON","LAT", "Corr", "Method")
library(geosphere)
vec_dist<- vector()
for (i in 1:length(cor_table[,1])) {
vec_dist[i]<- distm(c(-7.713948, 42.628577), c(as.numeric(cor_table[i,1]),
as.numeric(cor_table[i,2])),
fun = distHaversine)
}
vec_dist<- round(vec_dist/1000, digits = 1)
cor_table<- as.data.frame(cbind(cor_table, vec_dist))
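The distance loop above can also be vectorized without `geosphere`. A base-R sketch with a hypothetical helper `haversine_km`, using the mean Earth radius of 6371 km (geosphere::distHaversine defaults to the equatorial radius 6378.137 km, so values differ slightly):

```r
# Vectorized great-circle distance in km (haversine formula, coordinates in degrees).
haversine_km <- function(lon1, lat1, lon2, lat2, r = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  # pmin guards against sqrt(a) exceeding 1 through floating-point error
  2 * r * asin(pmin(1, sqrt(a)))
}

# vec_dist <- round(haversine_km(-7.713948, 42.628577,
#                                as.numeric(cor_table$LON),
#                                as.numeric(cor_table$LAT)), 1)
```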
cor_table
## LON LAT Corr Method vec_dist
## 111 -7.6756591796875 42.415828704834 0.6819757 D1.uncorrected 23.9
## 112 -7.6756591796875 42.415828704834 0.6819757 D1.1 23.9
## 113 -7.6756591796875 42.415828704834 0.6820122 D1.2 23.9
## 44 -7.478271484375 42.1520957946777 0.6410577 D1.3 56.5
## 81 -7.4857177734375 42.2868041992188 0.6616118 D1.4 42.4
## 311 -7.33944702148438 42.9655685424805 0.6313029 D1.5 48.4
## 251 -7.69937133789062 42.8197975158691 0.6326919 D1.6 21.3
## 82 -7.4857177734375 42.2868041992188 0.6713669 D1.7 42.4
## 312 -7.33944702148438 42.9655685424805 0.6078017 D1.svm1 48.4
## 310 -7.65997314453125 42.1464500427246 0.6045748 D1.svm2 53.9
## 271 -7.33221435546875 42.8309211730957 0.6176615 D1.svm3 38.5
## 114 -7.6756591796875 42.415828704834 0.6423457 D1.svm4 23.9
## 221 -7.50823974609375 42.6908493041992 0.6022039 D1.svm5 18.2
## 115 -7.6756591796875 42.415828704834 0.6090482 D1.svm6 23.9
## 116 -7.6756591796875 42.415828704834 0.6496231 D1.svm7 23.9
## 381 -7.5386962890625 43.2293548583984 0.6070733 D2.uncorrected 68.4
## 382 -7.5386962890625 43.2293548583984 0.6070733 D2.1 68.4
## 383 -7.5386962890625 43.2293548583984 0.6069044 D2.2 68.4
## 421 -7.54638671875 43.363941192627 0.6413140 D2.3 83.0
## 422 -7.54638671875 43.363941192627 0.6217031 D2.4 83.0
## 423 -7.54638671875 43.363941192627 0.6109681 D2.5 83.0
## 431 -7.36126708984375 43.3694076538086 0.6452272 D2.6 87.3
## 424 -7.54638671875 43.363941192627 0.6192445 D2.7 83.0
## 361 -7.9080810546875 43.2176132202148 0.5649173 D2.svm1 67.5
## 411 -7.73147583007812 43.3581962585449 0.5711481 D2.svm2 81.2
## 432 -7.36126708984375 43.3694076538086 0.6271874 D2.svm3 87.3
## 425 -7.54638671875 43.363941192627 0.6902568 D2.svm4 83.0
## 362 -7.9080810546875 43.2176132202148 0.5670526 D2.svm5 67.5
## 433 -7.36126708984375 43.3694076538086 0.6068788 D2.svm6 87.3
## 426 -7.54638671875 43.363941192627 0.6877907 D2.svm7 83.0